USES AND ABUSES OF ADVERSARY EVALUATION: A CONSUMER'S GUIDE



Even in the infant field of educational evaluation, the so-called adversary¹ model of evaluation stands out as a relative newcomer. Guba (1965) suggested over a decade ago that educational evaluation might well adopt or adapt aspects of the legal paradigm. Apparently it was five years later before the first adversary evaluation in education was conducted (Owens, 1971). Since then, only a handful of evaluators have either conducted adversary evaluations or written about them. In a recent informal survey, Owens and Hiscox (1977) identified only six evaluations which they judged to be truly adversarial in nature (Owens, 1971; Hiscox and Owens, 1975; Wolf, 1975; Stenzel, 1976; Levine, 1976; and Northwest Regional Educational Laboratory, 1977). Even if one included other studies which might be viewed as adversary evaluations (e.g., Stake and Gjerde, 1975; Kourilsky and Baker, 1976), it would seem safe to state that no more than eight or ten such evaluations have been conducted throughout the nation. Add the thoughtful conceptual work on adversary evaluation (e.g., Owens, 1972; Wolf, 1973), and there is still a paucity of information about this widely publicized evaluation method. Much more thought and experience is necessary before it will be clear whether the adversary method has the potential claimed by its proponents. Hopefully this symposium will help broaden the dialogue about adversary evaluation and lead to more careful analysis and experimentation in this area.



¹There are evaluations using multiple advocate teams, each of which develops an independent position which may or may not be in opposition to other teams' positions (e.g., Reinhard, 1974). These are not included in this paper as examples of adversary evaluation.



The Focus of this Paper 

When this symposium was planned, others were to argue for and against the adversary approach, while our assignment was to take the more neutral ground and provide an objective analysis of situations and settings where adversary evaluation would be beneficial and where it would be ill-advised. In the interim since that planning, however, we have both been involved in a large-scale adversary evaluation, one as a member of one of the opposing evaluation teams and one as co-director of the overall study and arbiter in disputes and negotiations between the teams.² Although we both began by believing the adversary approach would be very useful for that particular evaluation, the difficulties experienced as we conducted the study were severe enough to give us serious second thoughts about the whole business of "adversarying." In fact, in the midst of the evaluation, we were sufficiently disenchanted that it was tempting to agree with Popham and Carlson (1977) in condemning the whole approach.

²This evaluation, which serves as the basis for much of the experience reported in this paper, is the Northwest Regional Educational Laboratory (NWREL) evaluation of the Hawaii 3 on 2 program, a large, controversial statewide team teaching program in the primary grades (NWREL, 1977). This study will hereafter be referred to as the 3 on 2 evaluation.

The presentation of the 3 on 2 final adversary reports and the "aftermath" of the evaluation have changed our perspective considerably, however. We have also been influenced by reactions of many key people affected by the evaluation. Although we still have reservations and cautions to share with you, we are convinced that the basic concept of adversary evaluation has real merit, if it is applied with prudence and judgment to those situations where it would be both appropriate and advantageous. There is clearly real potential in the adversary approach for making evaluation findings more meaningful to educational decision makers. This does not mean that we accept without reservation all the claims made by proponents of adversary evaluation (e.g., Wolf, 1975; Wright and Sachse, 1977). We are frankly fearful that overly zealous supporters may fail to be sufficiently introspective to find and correct critical flaws in the adversary concept. We are equally fearful that preoccupation with the paraphernalia of the adversary model could cause evaluators to overlook the real benefits and problems that can result from its use.

In the remainder of this paper, we will present and discuss nine issues or questions which we think are central to the future of adversary evaluation in education. In stating our position on each issue, we hope to generate productive dialogue which can lead to development and refinement in the use of adversary evaluation.

1. Is There a Clearly Delineated Adversary Model of Evaluation Which Evaluators or Decision Makers Can Apply?

One of the authors has argued elsewhere (Worthen, 1977) that the term "evaluation models" is a misnomer when applied to the current conceptualizations about educational evaluation. This argument, which will not be repeated here, is not intended to denigrate the largely helpful suggestions which arise in the literature, but only to correctly describe them for what they are and are not. In no instance is the term model less appropriate than in the case of the so-called adversary model of evaluation. None of the criteria for models stated by Kaplan (1964) or other philosophers of science is met; adversary evaluation offers no unified framework or coherent set of principles. It is only a rubric under which to describe a collection of divergent approaches which might loosely be referred to as adversarial in nature. In its broad sense, the term refers to all evaluations where there is planned opposition in the points of view of different evaluators or evaluation teams. The Websterian sense of "contending with, opposing" is central to this general definition. The fact that an evaluation approach includes a planned effort to generate opposing points of view within the overall evaluation is the sine qua non here, whereas the labeling becomes less important.

As Owens and Hiscox's (1977) descriptions make abundantly clear, none of the prior adversary evaluations (or writings on which they are based) are sufficiently well developed to set a standard for future efforts or to serve as a model of even the specific adversary approach employed. As yet there is little beyond personal preference to determine whether adversary hearings, debates or other approaches might be best in specific evaluation settings. Each approach should be further developed, applied in varied educational contexts, and studied to determine its relative utility under varying conditions. Given sufficient experience, Darwinian principles might apply and result in one specific adversary method proving best for most educational evaluations. In the meantime, it seems most defensible to use the term "adversary evaluation" in a broad sense and avoid the artificially precise and misleading terminology of "evaluation model."

The remainder of this paper assumes the notion of planned opposition among evaluators to be the only requirement for adversary evaluation.³



³The full range of forms this might take must await further development. In the meantime, it is obvious that some of the discussion in this paper will apply more directly to one type of adversary evaluation than another. We will leave it for others to tease out those specific applications.

2. Is the Legal Paradigm the Best Approach to Adversary Evaluation in Education?

Much of the effort to apply adversary evaluation in education has drawn on courtroom procedures, with an advocate and an adversary questioning and cross-examining witnesses and applying rules of admissibility of evidence customary in legal proceedings. If you were to ask any ten educational evaluators to describe the adversary evaluation approach, nine would probably talk in terms of witnesses, cross-examination, the jury system, and so forth.

The legal paradigm has intriguing possibilities for some evaluation situations, and Wolf (1973) has provided a good analysis of certain of these. We are not inclined, however, to view the legal paradigm as necessarily the best pattern for adversary evaluation. We tend to agree with Levine (1974) in favoring adversary evaluation more as a broad philosophical orientation which may be expressed in many forms. For example, cross-examination and juries may be appropriate in applying the courtroom model to educational evaluation, but they are hardly essential to conducting an adversary evaluation. In fact, one of our greatest concerns is that evaluators will seize on some of the more trivial features of the courtroom and fail to isolate and extract those adversarial aspects which might be most pertinent in educational evaluation. All that we have read and seen suggests to us that rigid adherence to the legal perspective is likely to result in weak adversary evaluations and an eventual rejection of the whole concept.

It might be useful to illustrate a few aspects of the legal system which seem to us unnecessary or downright inappropriate in educational evaluation. First, we believe some of our colleagues should be chided for their compulsion to replicate even the theatrical aspects of the courtroom in their adversary hearings. Cloaking the person presiding over an educational hearing in a black robe seems as pretentious and inane as placing powdered wigs on senators presiding over congressional hearings.

Second, we believe use of the legal model can result in a seductive slide into what might be termed an "indictment mentality," which can do a disservice both to evaluation efforts and to the programs being evaluated. Adversary evaluation literature which invokes the legal model tends to use terms such as "statement of charges" (Hiscox and Owens, 1975), "guilty or not guilty" (Levine, 1976), and the like. That orientation may be appropriate when there is a formal complaint against an educational program, as in the recent investigation of the University of Massachusetts School of Education programs. But formal complaints, plaintiffs and litigants are conspicuously absent in the typical educational evaluation, and rightly so. Evaluation in education should aspire to be an instrument for improving educational programs, not for determining their guilt or innocence. Although it is true that evaluators must of necessity render judgments of worth, that seems to us a far cry from invoking a model in which the program stands as "accused" on specific charges.

It is not just the vocabulary of the legal model that is problematic, but its characteristic of serving only when there is a problem to be solved. There is already too much of a tendency to view evaluation as something you do when a program is in trouble, when there is a crisis or failing which requires correction. It would be unfortunate if this tendency were exacerbated and evaluations conducted only when a complaint has been lodged, an accusation leveled, an offending program accused. It is precisely this orientation which we fear may be a side effect of basing evaluations on the legal model, or for that matter, on any model which is meant to be applied only in problem-solving or crisis situations. It would be far more salutary if educators came to view evaluation as something which was routinely carried out to help them keep their programs operating at maximum effectiveness and efficiency. If advocates of the judicial approach respond that they only intend legal concepts to be applied to adversary evaluations which are conducted where complaints and charges are involved, many of the above concerns would be eased.

Obviously, one should not dismiss all aspects of the legal paradigm as inappropriate. For example, cross-examination (properly conducted) would seem to have a potentially useful role in evaluations which use human testimony as a major source of data. Of course one can use cross-examination by adversaries without requiring full or even partial courtroom procedures. Witness congressional hearings or interviews conducted jointly by partisan interviewers. Wolf (1975) and Hiscox and Owens (1975) have shown that one can adapt portions of the legal model without adopting it in its entirety.⁴

⁴Even here we believe there needs to be more attention to developing an adversary evaluation approach which would be suitable in education, for routine non-problem settings, without straining too hard to bend to our use an extant approach which is built on assumptions and for situations markedly different from those that apply in most educational evaluations.

Hiscox and Owens (1975, p. 8) list five advantages which could accrue from lessening adherence to a strict legal model in adversary evaluations. Briefly, they are: (1) adversary evaluations could be conducted with lower investments of time and money; (2) adversary evaluations would be less dependent on availability of trained legal professionals; (3) adversary hearings could be more easily understood by evaluators and decision makers; (4) greater flexibility in addressing non-dichotomous issues would result; and (5) adversary evaluations or reporting could be conducted without formal hearings. We agree with these points. Although we believe the legal paradigm has merit as a heuristic, we also feel it carries many features which could be detrimental to educational evaluations. We hope others will view it with appropriate skepticism and entertain other alternatives before deciding which adversary approach is most suited to their needs in educational evaluation.

3. Does Adversary Evaluation Provide Decision Makers with the Full Range of Information Needed to Make Informed Decisions?

During our adversary evaluation of the Hawaii 3 on 2 Program, we worried considerably about whether the strong pro and con positions which were taken might increase the probability of an extreme decision being made without due consideration of the full range of possible decisions. Would adversary evaluation result in an unwitting loss of the middle ground? In the typical evaluation, where an evaluator is charged with strict neutrality and objectivity, the middle ground might well receive as much attention as the ends of the spectrum. But what about adversary proceedings where the antagonists anchor the ends of the decision spectrum and choose to ignore the middle? Which best serves the decision maker: conflict or compromise, contrast or convergence, polarized positions or plea-bargaining? Does the adversary approach lend itself to the type of diagnostic information which is so often needed by the thoughtful decision maker?

Wrestling with these questions forced us to examine them in terms of three other questions: (1) does adversary evaluation provide a solution to the problem of evaluators' biases slipping unnoticed into the evaluation; (2) is there a possibility for convergence in adversary evaluation; and (3) should an effort be made to present equally strong positive and negative arguments in adversary evaluations? Each of these areas is discussed briefly below.

Adversary Evaluation and Evaluator's Biases. Proponents of adversary evaluation (e.g., Wright & Sachse, 1977) have argued that evaluators are not the impartial, objective paragons they purport to be, and that they bring with them certain biases, often unrecognized, that influence their findings. Adversary evaluations are proposed as a solution since they intentionally counter-balance biases. One evaluator (or team) is assigned to present the positive case and is expected to be biased in favor of the program, while another is expected to be opposed to the program and be biased against it. The object, then, is not elimination of bias but rather balancing bias and making it public. Of course, still other biases and predispositions of the evaluators are unlikely to be affected by the mere assignment to a position. An individual evaluator's biases will obviously influence the rigor with which he can defend or criticize a program. Imagine the plight of Ralph Nader if he were assigned to defend a program or product. There is no great insight here, merely a reminder that bias is not magically eliminated or rendered inoperable by efforts to balance it.

Convergence in Adversary Evaluation. In moving from the usual evaluation stance of neutrality to that of having two biased protagonists, educators stand both to gain and lose. The gain is likely to be an increase in the spectrum of data and interpretations provided to decision makers; few other evaluation approaches seem likely to push as far in both the positive and negative directions as the adversary method. The loss could easily come from unnecessary polarization that shifts attention away from the middle ground so often essential to rational decision making.

Many adversary supporters (e.g., Hiscox and Owens, 1975) have claimed that conclusions and recommendations agreed to by both sides may be held with greater confidence by a decision maker. While this seems patently sensible, experience with adversary evaluations suggests such agreement is unlikely to be a spontaneous by-product of the sparring and jousting that often occurs between adversaries. Most adversary approaches have a competitive element; one of the adversaries will probably win and the other lose. When competition is high, cooperation tends to be lower. There is less of an inclination to search for agreement than is true under more collaborative circumstances. In highly competitive evaluations, mutual agreements are often abandoned in the adversaries' rush to dispute each assertion in hopes of turning it to their own advantage. When winning is at stake, even "black is black" pronouncements are sometimes questioned by seemingly rational opponents. Antagonists are often leery of agreements, even about things they may both believe, especially if they construe the agreement as potentially injurious to their case(s). Shared conclusions in adversary evaluation are not easy to come by. Most adversary approaches could profit from a better mechanism for seeking and reporting areas of agreement.

In the Hawaii 3 on 2 evaluation, it was decided that presenting the strongest possible pro and con cases would best serve the needs of the decision makers. As the evaluation progressed, it became apparent that the evaluation teams were tending toward "all or none" recommendations: maintain the program in its entirety or eliminate it completely. With that posture, it became difficult to get either team very "psyched up" about evaluation approaches which ferreted out features of the program which could be jettisoned without loss, or features which should be retained even if the overall program were scrapped. Several members of the evaluation team worried that this approach would result in loss of important diagnostic information which did not support either extreme position.⁵ The polarized report was very well received in Hawaii, however, and only two board members or administrators complained about the fact that the evaluation would lead to either a "go" or "no go" decision. Although we believe the 3 on 2 evaluation was a good one, it would have been much better in our judgment had some way been found to converge on areas where both teams agreed there were strengths or weaknesses.

⁵Much such information was included in the technical evaluation report, but since it was not presented in the more provocative adversary reports, it seems operationally to have had little impact on subsequent decisions about the program.

There is some empirical evidence which bears on the reconciliation of views between evaluators. Kourilsky and Baker (1976) reported that college students produce significantly better evaluations of a project when using an adversary approach than under two other approaches which do not involve confrontation. Their adversary treatment required that adversaries reconcile their views and produce a single recommendation to the decision maker. This method was found to produce significantly superior results over other less adversarial methods. Unfortunately, the study did not include a comparison treatment in which adversaries were not asked to converge, so one pivotal bit of data is still lacking.

Relative Strength of Adversary Positions. It may not be an explicit assumption of the approach, but many adversary evaluations proceed as if there is an unspoken obligation to present two equally convincing cases, one pro and one con. Of course no one would tolerate an adversary who slacked and presented a weaker case than was deserved on the basis of the data; but what about the advocate who errs in the other direction, who feels compelled to keep up with the opposition, even if it means straining or ignoring the data? Here is where we part philosophical company with some colleagues who seem to sincerely believe that a program is not represented well unless both sides are argued equally convincingly. That orientation strikes us as appropriate in a forensic society where the result of the debate seldom impacts on the proposition, but not in an evaluation where the outcome will influence real programs and real people.

Like the legal paradigm, the debate model also carries with it many irrelevancies that should be strained out before the model is applied to education. The critical difference is in the fact that the touchstones of debate are polemics and persuasion, not truth, which is central to the validity of evaluation studies. Debates surely use facts and cannot normally afford to ignore them, at least not totally. But seldom is the debater forced to adhere as tightly to the plain unadorned facts as is the conscientious evaluator. Logic can provide a permissive climate for manipulating the data until its form is favorable. Probably more sophistry results from debaters' perversions of syllogistic logic than any other self-deception known to man. At least part of this tendency must be traced to efforts to build strong cases on flimsy foundations.

Our recommendation in this area would be for decision makers to think carefully about the charge they give to adversary evaluators. We believe the appropriate mandate is that of presenting the most positive and most negative cases possible on the basis of the evidence which exists. Within that framework, the evaluator might be encouraged to employ all the techniques of persuasion possible, just so adversarial zeal does not lead to flights of fancy or specious arguments that exceed the evidence. Of course, one could depend on rebuttals or cross-examination to expose fallacies and errors introduced by overly enthusiastic adversaries, but that dependence seems optimistic. It would be better to require documentation and evidence for arguments at the outset rather than to allow unsubstantiated assertions to become part of the substance that is contested in an adversary evaluation. If such mandates to evaluators were made clear, then no evaluator would feel compelled to fabricate a strong positive case when none exists, where the overwhelming weight of the evidence reveals the product or program to be without redeeming features, or vice versa.

4. Does Challenging of Evidence in Adversary Evaluations Reduce Their Credibility?

Data in a typical evaluation are only contested by outside critics, usually after the fact. In adversary evaluations, the data themselves can become a source of dispute between adversaries, and this has both pluses and minuses. For example, one can argue (at least in educational evaluation) that all data and the instruments and designs that produce them are open to some degree of question. Therefore they may as well be questioned by opponents within the adversary framework as by possibly less-informed persons at a later point in time. There is also a potential, however, that disputes among adversaries over the validity of data will, at best, shift attention away from the substance of the evaluation to its processes, and, at worst, will jeopardize the credibility of the evaluation. Imagine an evaluation with two major types of data, let us say test scores and observer ratings. Imagine that one adversary makes every effort to discredit test scores (which not coincidentally favor the opposition) while the other seriously questions the observers' ratings. If both evaluators are skillful at pointing out and perhaps dramatically magnifying the flaws which exist in most data collection techniques, the net result could well be to discredit the entire data base and destroy the credibility of the study. For example, Popham and Carlson (1977) stated their view that the arbiters in the 3 on 2 evaluation exercised good judgment in allowing their team to argue that the tests used in the evaluation were invalid. Perhaps, but at least some in the Hawaii State Board of Education felt differently. When asked in a recent questionnaire, "Does the advocate-adversary approach provide decision makers with the evidence they need to make a choice?", one board member wrote, "Not when the integrity of the evaluation instruments is attacked. That attack on the instruments completely destroyed the credibility of the study's overall findings and, in a politically charged issue, allowed board members to ignore the evaluation and do whatever they wanted." That may be an over-reaction, but it does demonstrate the risks of allowing opponents to extend their contentions to the data base. In the heat of competition, methodological pimples have a way of getting portrayed as terminal illnesses.

Now this should not be construed as a suggestion that bad instruments or data should be tolerated. The point is simply that techniques should be built into adversary evaluations to produce a common core of data that both sides would accept as valid for purposes of judging the program. Variables and the best methods for measuring them should be agreed upon in advance, not determined on the basis of partisanship. Surely evidence should be challenged and only the most solid used as a basis for evaluative judgments, but it would seem wise to deal with this issue early in an adversary evaluation so the focus in the final stages can be on inferences, arguments and judging the program rather than quarreling about the adequacy of the evidential basis for the evaluation. We all enjoy the cleverness of the defense attorney who holds up an optometrist's chart at the back of the courtroom to prove the prosecution witness is myopic and could not possibly have identified the defendant at the distance claimed. High drama should be reserved for the TV courtroom; in educational evaluations, such faulty witnesses should be dispensed with much earlier and not at the final report stage.

Considerably more thought must be given in this area to working out rules for judging admissibility and validity of evidence in adversary evaluations.

5. In What Settings and Under What Circumstances Would an Adversary Evaluation Be Appropriate?

Even the most enthusiastic advocate of adversary evaluation is unlikely to argue that the approach would be appropriate in every evaluation. In an effort to get others' opinions on this issue, a questionnaire was developed⁶ and sent to key figures in Hawaii, both decision makers and evaluators. They were asked when they thought it would be appropriate to use an adversary evaluation.

⁶This questionnaire was developed jointly by one of the authors and William J. Wright.

A majority of the respondents indicated they saw an adversary evaluation as appropriate in the following instances:

a. When the program is controversial and people are polarized in their opinions over it (93%).

b. When decisions must be made about whether to continue or terminate a program (81%).

c. When the program is large and affects many people (77%).

d. When there are many different audiences for the evaluation report (65%), and

e. When the evaluation is conducted by persons external to the program (56%).

Very few respondents felt adversary evaluations would be appropriate when the evaluation was conducted by internal evaluators (15%) or for purposes of making decisions about how to improve the program (15%).⁷

These reactions and our own biases lead us to suggest several factors which we think should govern decisions about when to use the adversary approach.⁸

⁷Written comments suggested some respondents reacted this way because of their perception that adversary evaluations give up the diagnostic middle ground relevant to program improvement decisions.

⁸Obviously these points may need to be altered somewhat if one chooses to look at a specific type of adversary approach, such as the legal or debate model.

The Decision. As implied above, adversary evaluation would seem less relevant for most formative decisions than for summative decisions about program continuation. Using adversary evaluation also assumes the full range of decision alternatives is available to the decision maker. Aside from the intellectual enjoyment, arguing from adversary positions is of dubious worth if one side has no chance, e.g., lack of funds dictates that a program be terminated regardless of its quality. If clearly competing courses of action are not available, the adversary approach has little to recommend it.

The Object of the Evaluation. The Hawaii respondents felt adversary
evaluation was most appropriate for large, controversial programs which
had a variety of interested audiences. We tend to agree. The adversary
approach is an ambitious, costly and sometimes cumbersome method. As
such, it should be reserved for cases which warrant the increased investment
of time and money and where its use would add significantly to the results of
the study. It would be presumptuous for us to suggest types of programs
where it should be used, but it seems clear that one does not wheel out
heavy artillery for every minor skirmish.

Clarity of Issues. Adversary evaluation loses its meaning unless issues
to be addressed by the adversaries are clearly identified and adhered to.
If one adversary dwells solely on test scores and the other deals exclusively
with financial aspects of the program, the potential advantages of the adversary
method are seriously diluted.

Credibility. There are instances where a program is so controversial
that no evaluation of it will be believed unless it can be shown unequivocally
that the evaluation made every effort to represent fairly both sides of the issue.
This is often true where previous evaluations of the program have been condemned
as one-sided or discounted on grounds of evaluator bias. Here the adversary
approach comes into its own with its built-in neutrality (or balanced bias) which
allows both sides of an issue to be well illuminated.



A related feature of the adversary approach is its potential for diffusing
political heat surrounding an evaluation. Some evaluators have privately
proposed that the best place for this approach might be the "hot potato"
evaluations where the evaluator will be pilloried no matter which way the
results come out. As one wag put it, "It's hard to claim an evaluation is
wrong when it argues both sides of the issue." There may be some truth in
that bit of facetiousness, since the 3 on 2 evaluation was conducted in a
political inferno and not only survived but was generally acclaimed in wide
press coverage as "unbiased," "a comprehensive study," and a "balanced
evaluation." At least no one claimed that the evaluation was biased, and the
heated exchanges and dual recommendations provided all the fodder necessary
for the administrative and political decision makers. The evaluators did not
get drawn back into the fray to defend recommendations which were under
attack. Those recommendations had already been attacked within the
evaluation. Not that the evaluation was not criticized; one legislator went
so far as to print an attack against both sides of the evaluation, for using
"disembodied statistics" and tests that would have received higher scores
from the "intoxicated and tightly controlled students of Nazi Germany." Yes,
even the adversary approach fails to dispel some folks' distrust of anything
as anti-humanitarian as a test item.

Courageous Clients. By now it should be apparent that not all
administrators are likely to have the heart to initiate adversary evaluations of their
programs. Hiscox and Owens (1975, p. 6) found that

"... some administrators indicated that they would not be interested
in using an adversary hearing as a decision-making tool. They felt
that many of their decisions were based largely on personal experience,
were not to be resolved publicly or had political overtones so that
a logical decision based on solicited facts might not be adequate
for their needs."

Such administrators probably would prefer avoiding any evaluation at all.
They should be doubly tempted to avoid an adversary evaluation, where few
stones remain unturned. The competitive nature and processes of adversary
proceedings also make them less predictable than more standard approaches.
An administrator who willingly requests an adversary evaluation is either a
self-confident and visionary leader or uninformed about the approach.

Costs. Much of the informal dialogue among evaluators over the past two or
three years has questioned whether the adversary approach is worth its
considerable costs. In fairness, the cost depends on how you play the game.
If a full-blown courtroom procedure is employed, the cost is likely to be
proportionately quite high for the amount of data produced. If every phase
of an evaluation involves the kind of two-party cross-checking Wright and
Sachse (1977) describe, the resultant data may be better, but you need not be
a mathematical whiz to predict the cost will double. Even a simple debate as
a vehicle for presenting findings from a standard evaluation is an added expense.

The real question, however, is not cost but cost-effectiveness or cost-benefit.
On these dimensions, it seems apparent that benefit must be argued
on grounds that adversary evaluation increases things like representativeness
of the data, fairness of the instruments, communication between evaluators
and decision makers, and identification of all the pros and cons. Whether
adversary evaluation really provides more benefits will remain an open
question until someone sees fit to research the issue. In the interim, the
survey of Hawaii educators is provocative. When asked if they felt the
information produced by the adversary approach was worth the cost of having
two teams involved, 78 percent said it was, and another 15 percent said it
was worth more than the cost. Only 7 percent felt it was not worth the money.
Of course, these reactions should not be generalized beyond the evaluation
to which they were reacting, but they do demonstrate that even a relatively
costly adversary evaluation can be viewed as worth the cost.



6. How Should Adversary Evaluations be Conducted?



This paper will not address this issue satisfactorily, for there must be
at least as many answers as there are different approaches one might make
to adversary evaluation. The best method for conducting an evaluation with
two independent adversary teams with separate budgets is obviously different
from that for using a debate as an enlightening way to present a standard
evaluation report. Rather than speculate on how to conduct such variations,
it might be helpful to list some critical features in the 3 on 2 evaluation since
it represents one view of how an adversary evaluation might be structured.

First, two evenly balanced teams were formed.

Second, both teams worked together to create the best possible design
and choose the best possible instruments to provide the common core of data
to be shared by both teams. The intent here was to develop a comprehensive
data base that would be accepted as valid by both teams.10 The thought was to
get all persons to think about information needed by both adversaries before
they knew which position they would represent.



9 Also, it might be reiterated that the basic adversary concept can be implemented
without the heavy costs associated with some of the approaches discussed earlier.

10 The fact that this intent was not realized, and one team chose to attack the tests,
does not negate the usefulness of this point. It merely underscores the need for
clearer and firmer ground rules from the outset.

Third, teams were assigned to adversary positions for the balance of the
evaluation. Data collection and analysis were mostly joint efforts, with checks
and balances built in to prevent either team from influencing the outcomes in
their favor. Reporting was decidedly adversarial with a written and live debate
format, buttressed by a neutral technical report.

Wright and Sachse (1977) have described several phases of evaluation
during which adversary input is useful. In our view, the adversary approach
reaches its zenith in the reporting stage. Much of the positive reception to
the Hawaii 3 on 2 evaluation is probably attributable to the report format.
The interest and positive effect stimulated by the public debate format must
be viewed as considerable since the findings of our study generally parallel
those of the previous evaluation which was soundly censured two years earlier.

There are probably many sensible approaches which could still take
advantage of the adversary report format, while streamlining the process
and cutting costs. For example, one individual or team could conduct the
entire evaluation, with two outsiders assigned to present the advocate and
adversary cases from the data generated by the evaluation. The same
outsiders could also obviously be called in earlier to ensure balance in the
choice of variables and instruments, check the design for fairness, and so
forth.

Other possibilities are left to the imagination of the reader.

7. Do Adequate Guidelines Exist for Use in Managing Adversary Evaluations?

Given the newness of adversary evaluation, it would be foolish to expect
adequate guidelines to have emerged for any of the variations which have been
proposed. Some extant paradigms (e.g., debate or courtroom models) do
have carefully prescribed operating guidelines, but they have to be bent so
far to fit educational evaluations that they become largely inapplicable.
Other de novo approaches which have been developed have even fewer
procedural guidelines to suggest.

To further dialogue in this area, a few administrative or planning
guidelines which our experience and observation suggest would be relevant in any
adversary evaluation are listed below.

First, we share Popham's view that the director must be concerned with
achieving as much balance as possible in the relevant skills and strengths
of the adversaries.

Second, it is paramount that the evaluation ground rules which will apply
in the study be spelled out in specific detail before the evaluation begins.
Such ground rules must be in place and agreed to by all parties prior to the
time that partisan positions are assigned. Decisions about admissibility and
validity of evidence should be agreed to early and adhered to throughout.
Decide early on the role of the judge or the arbiters. What criteria constitute
an objection that should be sustained? What rules govern how far arbiters can
go in insisting that all arguments that draw on the data be adequately documented,
or that claims not supported by data be removed? Sufficient attention given to
spelling out such ground rules adequately at the outset of an adversary evaluation
will avoid many problems later on. Do not assume that general ground rules
will suffice and that usual collegial congeniality will make compromise and
resolution simple in areas overlooked in initial guidelines. In our experience,
it seems unrealistic to expect such behavior in a confrontive methodology
calculated to create opposition.



The need for specific guidelines for each evaluation is important in view
of the absence of a body of procedural canons and guidelines. Attention to
this step at the outset may help to resolve Popham and Carlson's concern over
the absence of appellate mechanisms for adversary evaluations.

Another area in need of careful management in adversary evaluations
involves decisions about data to be collected. Kourilsky and Baker (1976)
noted that adversary evaluation results in longer evaluation reports and
requires the development of guidelines for careful but parsimonious reporting.
The Hawaii 3 on 2 experience corroborated theirs, for adversary team members
tended to collect or request a good deal of data without adequate plans for its
use. Part of the problem was an apparent reluctance to allow their opponents
to get ahead in the "data-aggregation game." The result was that a fair amount
of data was underutilized.

There are many other guidelines that might be suggested, but brevity
requires that we quickly move to our final two main points.

8. Does Adversarial Evaluation Alter the Nature of Evaluation Ethics?

The field of educational evaluation does not yet have an articulated,
formalized code of ethics. The work of the committee on evaluation standards
empaneled jointly by AERA and other professional associations (Stufflebeam,
1977) is directly relevant, but it is still too early to tell just how much guidance
that effort will give in the area of ethical practices. In the interim, there does
seem to be general agreement among most evaluators on certain minimum
essentials of ethical behavior, and at the heart of these lie venerable
principles such as impartiality and neutrality.

In the typical educational evaluation, evaluators are to be neutral and
impartial, leading to a fair unprejudiced evaluation. The evaluator's role
is roughly parallel to that of a judge, where impartiality is the sine qua non
in weighing the evidence.11 In adversary evaluation, it is only the overall
structure and process which is obliged to be impartial. The evaluators are
intentionally partisan and their roles approximate that of lawyers, where
neutrality gives way to advocacy.

Theoretically, shifting from non-adversary to adversary evaluation does
not lessen the impartiality with which decisions and judgments will be made.
But it most decidedly forces the individual evaluator to put aside reverence
for personal impartiality and adopt standards of behavior more like those
followed by debaters and attorneys.12 Is that good? Or is the sudden shift
to a new role disruptive and dysfunctional for evaluators? Frankly, we have
no idea. It seems unlikely that evaluators will be permanently damaged
by occasional forays into fields where different standards are followed. But
might adversarial behavior prove addictive, making the tough job of remaining
unprejudiced in non-adversary evaluations even tougher for the evaluator who
traverses the ethical boundary too frequently? Time will tell.

Although not ethical considerations per se, it is interesting to note the
behavioral modifications which are sometimes wrought by adversary contests.

11 According to Wright and Sachse, such impartiality may exist more in folklore
than fact. We agree that evaluators are fallible, but are unconvinced that they fail
to be impartial as frequently as our colleagues' rhetoric implies.

12 In this post-Watergate period where potshots at attorneys are a lamentable
national pastime, we stress that our reference here is to personal impartiality,
not personal integrity.

Win/lose situations can tax one's professionalism and it is a credit to the Hawaii
3 on 2 evaluation team leaders that throughout the 3 on 2 debates they refrained
from challenging one another's integrity and ancestry. Of course, that might
have been partly due to the rule we were forced to make early in the evaluation that
adversaries could not make disparaging remarks about one another's mothers.

Adversary evaluation also provides memorable moments, like hearing a
hard-nosed empiricist whimsically scold his opponent for presenting a
"data-drenched report." Another was watching newsmen scurry for the telephones
when one adversary referred to the program as "a beautiful dream from 1968
that remained only a beautiful dream eight years later." (That bit of prose
made headlines in all the Honolulu papers.) And then there was the emotional
moment when one adversary asked for special indulgence from the "jury"
because he was small and bald. If nothing else, the adversary approach is
hardly boring.

9. Are Educational Evaluators Competent to Conduct Advocacy Evaluations?

This issue cannot be addressed well until someone completes a careful
analysis of the skills and knowledge required of evaluators in the various
adversary approaches. In the meantime, predicting who will make good
adversary evaluators must be categorized with other forms of crystal ball
gazing. There are basic considerations like technical ability, communications
skills, and general ability, to be sure, but those are too gross to be very
helpful. Lack of information about what skills are needed also makes it
difficult to develop criteria to measure how well an adversary evaluator is
performing. Adding adversary skills to the repertoire of techniques provided
in training programs for educational evaluators will also be impeded until
better information is available.

In the meantime, based on our limited observations of adversary evaluations,
we suspect that most educational evaluators are not well prepared to play the
adversarial role, especially if the legal model is adopted. Hiscox (1976)
noted the following problems in previous adversary hearings:

1. Political and professional considerations make it difficult for
   educator-advocates to attack incompetence of statements and
   evaluations presented as evidence.

2. Educator-advocates fail to appear adversarial; they often make
   points for the other side with their questions and/or evidence.

3. People unskilled in soliciting "testimony" often get rambling,
   unproductive evidence.

We have noticed similar tendencies in educators we have watched function
in adversary hearings based loosely on the legal model. Questioning skills
were notably lacking and witnesses were permitted to ramble in long monologs
that addressed the questions indirectly, if at all. Probing of obvious
contradictions in testimony usually stopped short of highlighting the contradiction,
as if the most important thing were to avoid embarrassing the witness. The
relevance of testimony to major issues in the case was often left obscure.

Educators may be able to function more readily in the debate model, but
even this bit of optimism is mostly speculative. If adversary evaluation is
to become a potent force in educational evaluation, more thought must be
given to defining and providing training in this area.

Conclusion

We have discussed what we believe to be some major potentials and pitfalls
of adversary evaluation. We have expressed our suspicion that the courtroom
model may have limited utility for adversary evaluations in education, and we have
pointed out difficulties that seem inherent in the debate model. We have argued
that the existence of opposing viewpoints is the core of adversary evaluation, not
adherence to existing formats for presenting contrasting views. We have suggested
that educational evaluators might develop more appropriate adversary methods
tailored specifically for the field of education. We have addressed nine issues
which should be considered by anyone intending to use the adversary approach.

Where our analysis has been critical, it is prompted by a desire to see
improvements in an approach which we feel could be very useful in selected
evaluation settings. Adversary evaluation seems to hold considerable promise
for improving the data base on which important educational decisions are made,
if the pitfalls we have outlined can be resolved.